Task Oriented In-Domain Data Augmentation

Liang, Xiao, Hu, Xinyu, Zuo, Simiao, Gong, Yeyun, Lou, Qiang, Liu, Yi, Huang, Shao-Lun, Jiao, Jian

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have shown superior performance in various applications and fields. To achieve better performance on specialized domains such as law and advertisement, LLMs are often continually pre-trained on in-domain data. However, existing approaches suffer from two major issues. First, in-domain data are scarce compared to general domain-agnostic data. Second, data used for continual pre-training are not task-aware, so they may not be helpful to downstream applications. We propose TRAIT, a task-oriented in-domain data augmentation framework. Our framework is divided into two parts: in-domain data selection and task-oriented synthetic passage generation. The data selection strategy identifies and selects a large amount of in-domain data from general corpora, and thus significantly enriches domain knowledge in the continual pre-training data. The synthetic passages contain guidance on how to use domain knowledge to answer questions about downstream tasks. We adapt LLMs to two domains: advertisement and math. On average, TRAIT improves LLM performance by 8% in the advertisement domain and 7.5% in the math domain.

Large language models (LLMs) have achieved significant performance improvements in various applications such as language modeling (Brown et al., 2020; Touvron et al., 2023; Chowdhery et al., 2023) and visual understanding (Radford et al., 2021). They have also shown superior performance in fields such as finance (Xie et al., 2023b), e-commerce (Ma et al., 2023), and healthcare (Bakhshandeh, 2023). However, these models are usually trained on a large amount of general domain-agnostic data, such as web corpora. Because of the lack of domain-specific training, LLMs suffer from subpar performance when directly applied to certain domains such as advertisement. To adapt LLMs to a specific domain, continual pre-training methods (Gururangan et al., 2020) are commonly applied. In particular, the LLM is continually pre-trained on in-domain corpora, so that it can acquire domain knowledge and better adapt to downstream tasks.
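To make the two stages concrete, here is a minimal sketch of how such a pipeline could be wired up. This is an illustration, not the authors' code: the embedding-based selection rule, the encoder choice, the threshold, and the prompt template are all assumptions.

```python
# A minimal sketch of the two TRAIT stages (selection, then synthesis),
# assuming an embedding-similarity selection rule; the encoder, threshold,
# and prompt below are illustrative, not the authors' implementation.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

def select_in_domain(general_corpus, seed_docs, threshold=0.5):
    """Keep general-corpus documents that embed close to in-domain seed docs."""
    seed_emb = encoder.encode(seed_docs, convert_to_tensor=True)
    selected = []
    for doc in general_corpus:
        doc_emb = encoder.encode(doc, convert_to_tensor=True)
        # Score each document by its best cosine similarity to any seed.
        if util.cos_sim(doc_emb, seed_emb).max().item() >= threshold:
            selected.append(doc)
    return selected

PASSAGE_PROMPT = (
    "Using the domain facts below, write a passage explaining, step by step, "
    "how they answer the downstream-task question.\n"
    "Facts: {facts}\nQuestion: {question}\nPassage:"
)

def synthesize_passage(llm, facts, question):
    """Generate a task-oriented passage; llm is any text-in/text-out callable."""
    return llm(PASSAGE_PROMPT.format(facts=facts, question=question))
```

In the framework as described, the selected documents and the synthetic passages together form the continual pre-training corpus.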


A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Gur, Izzeddin, Furuta, Hiroki, Huang, Austin, Safdari, Mustafa, Matsuo, Yutaka, Eck, Douglas, Faust, Aleksandra

arXiv.org Artificial Intelligence

Pre-trained large language models (LLMs) have recently achieved better generalization and sample efficiency in autonomous web automation. However, performance on real-world websites has still suffered from (1) open domainness, (2) limited context length, and (3) lack of inductive bias on HTML. We introduce WebAgent, an LLM-driven agent that learns from self-experience to complete tasks on real websites following natural language instructions. WebAgent plans ahead by decomposing instructions into canonical sub-instructions, summarizes long HTML documents into task-relevant snippets, and acts on websites via Python programs generated from those sub-instructions and snippets. We design WebAgent with Flan-U-PaLM, for grounded code generation, and HTML-T5, a new pre-trained LLM for long HTML documents that uses local and global attention mechanisms and a mixture of long-span denoising objectives, for planning and summarization. We empirically demonstrate that our modular recipe improves the success rate on real websites by over 50%, and that HTML-T5 is the best model for various HTML understanding tasks, achieving a success rate 18.7% higher than the prior method on the MiniWoB web automation benchmark and SoTA performance on Mind2Web, an offline task-planning evaluation.
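The abstract describes a three-stage loop: plan the next sub-instruction, summarize the page, then synthesize a program. A minimal sketch of that loop follows; the `plan_model` and `code_model` wrappers and their text-in/text-out `generate` interface are assumptions for illustration, not the paper's actual APIs.

```python
# A minimal sketch of the modular WebAgent loop described above; the model
# wrappers and prompt formats are assumptions, not the paper's interfaces.

def web_agent_step(instruction, html, plan_model, code_model, history):
    # 1. Planning: the long-context model (HTML-T5 in the paper) picks the
    #    next canonical sub-instruction, conditioned on progress so far.
    sub_instruction = plan_model.generate(
        f"instruction: {instruction}\nhistory: {history}\nhtml: {html}"
    )
    # 2. Summarization: the same model extracts task-relevant HTML snippets,
    #    shrinking a long document to fit the code model's context window.
    snippets = plan_model.generate(
        f"extract snippets for: {sub_instruction}\nhtml: {html}"
    )
    # 3. Program synthesis: the code model (Flan-U-PaLM in the paper) turns
    #    the sub-instruction and snippets into an executable Python program.
    program = code_model.generate(
        f"# Sub-instruction: {sub_instruction}\n# Snippets: {snippets}\n"
        "# Write Python that performs this step on the page:\n"
    )
    return sub_instruction, program  # the program is then run on the website
```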


What Do We Really Know About Teaching Kids Math?

The New Yorker

Earlier this week, I wrote about the history of progressive math education, the culture wars it has inspired over the past hundred years, and the controversy over the California Math Framework. Today, I want to start with a much broader question: What do we really know about how to teach math to children? The answer is not all that much, and what little we do know is highly contested. An American math education usually proceeds in a linear fashion, with the idea that one subject prepares you for the next. Take, for example, the typical path through mathematics for a relatively advanced student.


Sewer Monitoring Turns to AI

#artificialintelligence

The vast networks of buried water, wastewater and storm water infrastructure are the veins and arteries feeding our people and our cities and protecting our environment. Without sustainable and viable water, waste and storm water solutions, our quality of life is in peril. Society's water, wastewater and storm water systems have played significant roles in eliminating disease, safeguarding the environment and protecting communities. Thanks to substantial post-Depression and post-World War II investments, most in the U.S. have grown up without the need to give this infrastructure a second thought. We open the taps and a clean, safe and seemingly unlimited supply of water is available to us; our waste is whisked away, treated and returned to the environment; and storm events rarely interrupt our daily lives.


Flight-Simulator Enthusiasts Confident of Real-World Skills

WSJ.com: WSJD - Technology

Last year, two million units of vehicle-simulation games for PCs and consoles were sold world-wide, the most common being flight simulators, according to the market-research firm NPD Group. Josh Edgar, a 25-year-old technology consultant who lives in Dallas, recently spent about $500 on a flight-simulation setup. He said that when he was in grade school, an airline pilot let him briefly step into the cockpit. "Seeing all the lights and switches definitely piqued my curiosity," he said. Howard Penley, a recently retired United Airlines captain, said passengers who once toured his flight deck often would remark on the complexity of all the buttons and dials.